Zhennan Lin

SoulX-Transcriber: A Robust End-to-End Framework for Multi-Speaker Speech Transcription

Jun 01, 2026
Via arXiv

Towards Fine-Grained Multi-Dimensional Speech Understanding: Data Pipeline, Benchmark, and Model

May 12, 2026
Via arXiv

Listening with Time: Precise Temporal Awareness for Long-Form Audio Understanding

Apr 24, 2026
Via arXiv

Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR

Apr 03, 2026
Via arXiv

MSU-Bench: Towards Understanding the Conversational Multi-talker Scenarios

Aug 11, 2025
Via arXiv

Leveraging LLM and Self-Supervised Training Models for Speech Recognition in Chinese Dialects: A Comparative Analysis

May 27, 2025
Via arXiv